WHAT IS CLAIMED IS:

1. A method for presenting an application in a plurality of modalities,
comprising the steps of: 

retrieving a modality-independent document from one of local and remote 
storage; 

parsing the modality-independent document using parsing rules obtained from one 
of local or remote storage; 

converting the modality-independent document to a first intermediate 
representation that can be rendered by a speech user interface modality; 

converting the modality-independent document to a second intermediate 
representation that can be rendered by a GUI (graphical user interface) modality; 

building a cross-reference table by which the speech user interface can access 
components comprising the second intermediate representation; 

rendering the first and second intermediate representations in their respective 
modality; and 

receiving a user input in one of the GUI and speech user interface modalities to 
enable multi-modal interaction and control the document presentation.

2. The method of claim 1, wherein the GUI and speech user interface
modalities are synchronized in the document presentation.

3. The method of claim 1, wherein the first intermediate representation is
stored in local system memory for immediate rendering. 



4. The method of claim 1, wherein the step of converting the
modality-independent document to the first intermediate representation comprises
transcoding the modality-independent document to a speech markup script.



5. The method of claim 4, wherein the step of rendering comprises the step of
deferred rendering of the speech markup script.

6. The method of claim 4, wherein the speech markup script is stored on a
local persistent storage device.

7. The method of claim 4, wherein the speech markup script comprises
VXML (Voice Extensible Markup Language).

8. The method of claim 1, further comprising the step of executing an
applications program when a corresponding event call occurs within the 
modality-independent document.

9. The method of claim 8, wherein the step of executing an applications
program comprises updating existing grammar rules with data values returned from the 
applications program. 

10. The method of claim 8, wherein the step of executing an applications 
program comprises updating content values associated with a component of the 
modality-independent document using data values returned from the applications
program. 

11. The method of claim 1, further comprising the step of registering a
program to be executed upon completion of a specified event. 

12. The method of claim 1, wherein the modality-independent document
comprises an intent-based document. 

13. A program storage device readable by machine, tangibly embodying a 
program of instructions executable by the machine to perform the method steps of claim
1.

14. A method for providing global help information when presenting a
modality-independent document, the method comprising the steps of: 

preparing an internal representation of a structure and component attributes of the 
modality-independent document; 

building a grammar comprising rules for resolving specific spoken requests; 

processing a spoken request utilizing the grammar rules; and 

presenting an aural description of the modality-independent document in response 
to the spoken request. 

15. The method of claim 14, wherein the step of presenting an aural 
description of the modality-independent document comprises presenting one of document 
components, attributes, methods of interaction, and a combination thereof. 

16. A program storage device readable by machine, tangibly embodying a 
program of instructions executable by the machine to perform the method steps of claim 
14. 

17. A method for providing contextual help information when presenting a
modality-independent document, the method comprising the steps of: 

preparing an internal representation of a structure and component attributes of the 
modality-independent document; 

building a grammar comprising rules for resolving specific spoken requests;

processing a spoken request utilizing the grammar rules; and

presenting an aural description of the components, attributes, and methods of
interaction of the modality-independent document in response to the spoken request.

18. The method of claim 17, wherein the step of building a grammar
comprises the step of combining values obtained from data stored in one of local storage, 
remote storage, and a combination thereof, with values obtained from an analysis of the 
modality-independent document. 

19. A program storage device readable by machine, tangibly embodying a 
program of instructions executable by the machine to perform the method steps of claim 
17. 

20. A method for providing feedback information when presenting a 
modality-independent document, the method comprising the steps of: 

preparing an internal representation of the structure and component attributes of 
the modality-independent document; 

building a grammar comprising rules for resolving specific spoken requests; 

processing a spoken request and resolving the spoken request utilizing the 
grammar rules; 

obtaining state and value information regarding specified components of the 
document from the internal representation of the document; and

presenting an aural description of the content values associated with document
components in response to the spoken request.

21. The method of claim 20, wherein the step of building a grammar
comprises the step of combining values obtained from data stored in one of local storage, 
remote storage, and a combination thereof, with values obtained from analysis of the 
document. 

22. A program storage device readable by machine, tangibly embodying a
program of instructions executable by the machine to perform the method steps of claim
20.

23. A method for aurally spelling out content values associated with
components of a modality-independent document, the method comprising the steps of:

preparing an internal representation of a structure and component attributes of the
modality-independent document;

building a grammar comprising rules for resolving specific spoken requests;

processing a spoken request utilizing the grammar rules;

obtaining state and content value information regarding specified components of
the document from the internal representation of the document; and

presenting each character of the content value information requested in response
to the spoken request.

24. The method of claim 23, wherein the step of presenting each character of
the content value information comprises the step of inserting pauses between each 
character of the content value information to be presented. 

25. The method of claim 23, wherein the step of building a grammar 
comprises the step of combining values obtained from data stored in one of local storage, 
remote storage, and a combination thereof, with values obtained from an analysis of the 
document. 

26. A program storage device readable by machine, tangibly embodying a 
program of instructions executable by the machine to perform the method steps of claim 
23. 

27. A system for presenting an application in a plurality of modalities, 
comprising: 

a multi-modal manager for parsing a modality-independent document to generate 
a traversal model that maps components of the modality-independent document to at least 
a first and second modality-specific representation; 

a speech user interface manager for rendering and presenting the first 
modality-specific representation in a speech modality;

a GUI (graphical user interface) manager for rendering and presenting the second
modality-specific representation in a GUI modality; 

an event queue monitor for detecting GUI events;

an event queue for storing captured GUI events; and

a plurality of methods, that are called by the speech user interface manager, for 
synchronizing I/O (input/output) events across the speech and GUI modalities. 

28. The system of claim 27, wherein the methods for synchronizing I/O events 
comprise a first method for polling for the occurrence of GUI events in the event queue 
and a second method for reflecting speech events back to the GUI manager and posting 
speech events to the multi-modal manager. 

29. The system of claim 27, further comprising a method for invoking 
user-specified programs that are specified in the modality-independent document. 

30. The system of claim 27, wherein the multi-modal manager comprises a 
main renderer that instantiates the GUI manager, the speech user interface manager, and a 
method for capturing GUI events. 

31. The system of claim 27, wherein the speech user interface manager
comprises JSAPI (Java speech application program interface).

32. The system of claim 27, wherein the speech user interface manager
comprises a VoiceXML browser. 

33. The system of claim 32, further comprising a transcoder for generating a 
VoiceXML script from the modality-independent document.